feat: experimental backends - Crawl4AI, Obscura, and Camoufox loaders #1093
Closed
Ege-BULUT wants to merge 1 commit into
Add experimental backends module providing alternative document loaders that can be selected via the node_config 'experimental' key.

Backends included:
- Crawl4aiLoader: async web crawler with markdown/HTML output
- ObscuraLoader: CDP-based stealth browser via Obscura or Chrome
- CamoufoxLoader: Firefox fork with C++-level fingerprint spoofing

Also:
- Add persistent Chrome profile and storage state caching to ChromiumLoader
- Add Cloudflare challenge detection with user guidance
- Add pytest e2e marker for network-dependent tests
- Add optional dependency groups: experimental-obscura, experimental-crawl4ai
- Support backend switching in FetchNode via node_config['experimental']
Summary
Adds an experimental backends module to ScrapeGraphAI with three alternative document loaders: Crawl4AI (async web crawler with markdown output), Obscura (CDP-based stealth browser), and Camoufox (Firefox fork with C++-level fingerprint spoofing). These backends can be selected via the new `experimental` key in `node_config`.

Also improves the core ChromiumLoader with a persistent Chrome profile, storage state caching across sessions, and Cloudflare challenge detection with user guidance.
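To make the Cloudflare challenge detection concrete, here is a minimal heuristic sketch. It is not the PR's implementation: the marker list, the function names `looks_like_cloudflare_challenge` and `guidance_for`, and the wording of the guidance message are all assumptions chosen for illustration.

```python
import re

# Heuristic markers often seen on Cloudflare interstitial pages.
# This list is an assumption for illustration, not the PR's actual rule set.
_CHALLENGE_MARKERS = (
    re.compile(r"<title>\s*just a moment", re.IGNORECASE),
    re.compile(r"/cdn-cgi/challenge-platform/", re.IGNORECASE),
    re.compile(r"cf-chl-", re.IGNORECASE),
)


def looks_like_cloudflare_challenge(html: str) -> bool:
    """Return True if the page source matches any known challenge marker."""
    return any(pattern.search(html) for pattern in _CHALLENGE_MARKERS)


def guidance_for(url: str) -> str:
    """Build a hypothetical user-facing hint for a blocked URL."""
    return (
        f"{url} appears to be behind a Cloudflare challenge. "
        "Consider reusing a persistent browser profile or trying one of "
        "the experimental backends."
    )


if __name__ == "__main__":
    page = "<html><head><title>Just a moment...</title></head></html>"
    if looks_like_cloudflare_challenge(page):
        print(guidance_for("https://example.com"))
```

A string-matching check like this is cheap but can produce false positives or miss newer challenge pages, so it is best treated as a hint for the user rather than a hard failure.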
What's included
New experimental backends
Core improvements
`experimental` config key to route to the selected backend loader.
Dependency extras
Camoufox does not require a Python extra (it runs via npx).
Usage
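As a rough illustration of backend selection through `node_config`, the sketch below routes on the `experimental` key. The loader classes here are stand-in stubs, and the assumption that the key holds a plain backend name (`"crawl4ai"`, `"obscura"`, `"camoufox"`) is hypothetical; the real FetchNode logic and accepted values may differ.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Type


@dataclass
class _StubLoader:
    """Stand-in for a document loader; the real loaders do actual fetching."""

    urls: List[str]

    def load(self) -> List[str]:
        return [f"{type(self).__name__} would fetch {url}" for url in self.urls]


# Stubs named after the loaders in this PR; their bodies are placeholders.
class ChromiumLoader(_StubLoader): ...
class Crawl4aiLoader(_StubLoader): ...
class ObscuraLoader(_StubLoader): ...
class CamoufoxLoader(_StubLoader): ...


# Hypothetical mapping from config value to loader class.
BACKENDS: Dict[str, Type[_StubLoader]] = {
    "crawl4ai": Crawl4aiLoader,
    "obscura": ObscuraLoader,
    "camoufox": CamoufoxLoader,
}


def select_loader(node_config: Optional[dict], urls: List[str]) -> _StubLoader:
    """Pick a loader from node_config['experimental'], else the default."""
    name = (node_config or {}).get("experimental")
    if name is None:
        return ChromiumLoader(urls)
    if name not in BACKENDS:
        raise ValueError(f"Unknown experimental backend: {name!r}")
    return BACKENDS[name](urls)


if __name__ == "__main__":
    loader = select_loader({"experimental": "camoufox"}, ["https://example.com"])
    print(loader.load())
```

Keeping the mapping in one dictionary means an unknown backend name fails loudly instead of silently falling back to the default loader.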
Notes
`pytest -m e2e` (network access)
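For context, a custom marker such as `e2e` is typically registered in a `conftest.py` hook so that pytest does not warn about an unknown mark. The sketch below shows one common way to do that; the marker description text is an assumption, and the PR may register the marker elsewhere (for example in `pyproject.toml`).

```python
# conftest.py (sketch)


def pytest_configure(config):
    """Register the e2e marker so `pytest -m e2e` selects network tests."""
    config.addinivalue_line(
        "markers",
        "e2e: end-to-end tests that require network access",
    )
```

Tests decorated with `@pytest.mark.e2e` are then selected by `pytest -m e2e` and excluded by `pytest -m "not e2e"`.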